Abstract

The emergence of dominating cliques in the Erdős-Rényi random graph model $G(n,p)$ is investigated in this paper. It is shown that this phenomenon possesses a phase transition. Namely, we argue that, given a constant probability $p$, an $n$-node random graph $G$ from $G(n,p)$, and $r = c \log_{1/p} n$ with $1 < c < 2$, the following holds: (1) if $p > 1/2$, then an $r$-node clique is dominating in $G$ almost surely, and (2) if $p < (3 - \sqrt{5})/2$, then an $r$-node clique is not dominating in $G$ almost surely. The remaining range of the probability $p$ is discussed in more detail. A detailed study shows that the question is settled by examining the sub-logarithmic growth of $r$ with $n$.

Keywords: Random graphs, dominating cliques, phase transition. 

1 Introduction



The phase transition phenomenon was originally observed as a physical effect. In discrete mathematics, it was first described by P. Erdős and A. Rényi in [5]. The graph property most frequently studied in relation to phase transitions in random graphs is connectivity. Recent surveys of known results in this area can be found in Refs. [5] and 0, Chapter 5.

Our paper deals with another interesting graph problem, namely the emergence of a dominating clique in a random graph. The theory of dominating cliques in random graphs has several nontrivial applications in computer science. The most significant ones are: (1) heuristics in satisfiability search [5] and (2) the construction of a space-efficient interval routing scheme with a small additive stretch for almost all large-scale distributed systems [T5].

1.1 Preliminaries and terminology 

Given a graph $G = (V,E)$, a set $S \subseteq V$ is said to be a dominating set of $G$ if each node $v \in V$ is either in $S$ or is adjacent to a node in $S$. The domination number $\gamma(G)$ is the minimum cardinality of a dominating set of $G$.

A clique in $G$ is a maximal set of mutually adjacent nodes of $G$, i.e., it is a maximal complete subgraph of $G$. The clique number, denoted $cl(G)$, is the number of nodes of a largest clique of $G$. If a subgraph $S$ induced by a dominating set is a clique in $G$, then $S$ is called a dominating clique in $G$.
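
The two definitions combine into a simple membership test. The following Python sketch is illustrative only: the function name `is_dominating_clique` and the adjacency-dictionary representation are assumptions made for this example and do not come from the paper.

```python
from itertools import combinations

def is_dominating_clique(adj, nodes):
    """Return True if `nodes` is a dominating clique of the graph `adj`.

    `adj` maps each node to the set of its neighbours (undirected graph).
    The representation is an assumption of this sketch.
    """
    s = set(nodes)
    # Clique: every pair of nodes in S must be adjacent.
    if any(v not in adj[u] for u, v in combinations(s, 2)):
        return False
    for v in adj:
        if v in s:
            continue
        hits = len(adj[v] & s)
        # Dominating: v needs at least one neighbour in S.
        # Maximal: v must not be adjacent to every node of S.
        if hits == 0 or hits == len(s):
            return False
    return True

# Path 0 - 1 - 2 - 3: the edge {1, 2} is a maximal clique that dominates.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(is_dominating_clique(path, {1, 2}))  # True
print(is_dominating_clique(path, {0, 1}))  # False: node 3 is not dominated
```

The maximality check matters: in a triangle, a single edge dominates the third node but is not a clique in the sense used here, because the third node extends it.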

The model of random graphs is introduced in the following way. Let $n$ be a positive integer and let $p \in \mathbb{R}$, $0 < p < 1$, be the probability of an edge. The (probabilistic) model of random graphs $G(n,p)$ consists of all graphs with the $n$-node set $V = \{1, \dots, n\}$ such that each graph has at most $\binom{n}{2}$ edges, each inserted independently with probability $p$. Consequently, if $G$ is a graph with node set $V$ and it has $|E(G)|$ edges, then the probability measure $\Pr$ defined on $G(n,p)$ is given by:

$$\Pr[G] = p^{|E(G)|} (1-p)^{\binom{n}{2} - |E(G)|} .$$

This model is also called the Erdős-Rényi random graph model.
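
As a small illustration of the model, the sketch below samples a graph by flipping one independent coin per node pair and evaluates the probability measure for a given edge set. The helper names `sample_gnp` and `graph_probability` are hypothetical and chosen only for this example.

```python
import random
from itertools import combinations
from math import comb

def sample_gnp(n, p, rng):
    """Sample an edge set on nodes 1..n; each pair is included with probability p."""
    return {pair for pair in combinations(range(1, n + 1), 2) if rng.random() < p}

def graph_probability(n, p, edges):
    """Pr[G] = p^|E(G)| * (1 - p)^(C(n,2) - |E(G)|)."""
    m = len(edges)
    return p ** m * (1 - p) ** (comb(n, 2) - m)

rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
g = sample_gnp(5, 0.5, rng)
print(len(g), graph_probability(5, 0.5, g))
```

For $p = 1/2$ every graph on $n$ nodes has the same probability $2^{-\binom{n}{2}}$, which gives a quick sanity check.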

Let $A$ be any set of graphs from $G(n,p)$ with a property $Q$. We say that almost all graphs have the property $Q$ iff:

$$\Pr[A] \to 1 \quad \text{as } n \to \infty .$$

The term "almost surely" stands for "with the probability approaching 1 as $n \to \infty$".

1.2 Previous work and our result

Dominating sets and cliques are basic structures in graphs and they have been investigated very intensively. Determining whether the domination number of a graph is at most $r$ is an NP-complete problem [6]. The maximum-clique problem is one of the first problems shown to be NP-hard [TT]. A well-known result of B. Bollobás, P. Erdős et al. states that the clique number in random graphs $G(n,p)$ lies within very tight bounds [H El HOI HH US HH-. Let $b = 1/p$ and let

$$r_0 = \log_b n - 2 \log_b \log_b n + \log_b 2 + \log_b \log_b e , \tag{1}$$

$$r_1 = 2 \log_b n - 2 \log_b \log_b n + 2 \log_b e + 1 - 2 \log_b 2 . \tag{2}$$

J. G. Kalbfleisch and D. W. Matula [10], [12] proved that a random graph from $G(n,p)$ almost surely does not contain cliques of order greater than $\lceil r_1 \rceil$ or of order less than or equal to $\lfloor r_0 \rfloor$. (See also (H1H3HS1-) The domination number of a random graph has been studied by B. Wieland and A. P. Godbole in [IT].
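
To get a feeling for how narrow the window between the bounds (1) and (2) is, they can be evaluated numerically. The sketch below assumes the bounds read as $r_0 = \log_b n - 2\log_b\log_b n + \log_b 2 + \log_b\log_b e$ and $r_1 = 2\log_b n - 2\log_b\log_b n + 2\log_b e + 1 - 2\log_b 2$; the function name `clique_order_bounds` is hypothetical.

```python
from math import ceil, e, floor, log

def clique_order_bounds(n, p):
    """Evaluate the bounds r_0 (1) and r_1 (2) for b = 1/p.

    Requires n > 1/p so that log_b(log_b(n)) is defined.
    """
    b = 1.0 / p

    def log_b(x):
        return log(x) / log(b)

    r0 = log_b(n) - 2 * log_b(log_b(n)) + log_b(2) + log_b(log_b(e))
    r1 = 2 * log_b(n) - 2 * log_b(log_b(n)) + 2 * log_b(e) + 1 - 2 * log_b(2)
    return r0, r1

r0, r1 = clique_order_bounds(10 ** 6, 0.5)
print(f"r0 = {r0:.3f}, r1 = {r1:.3f}")
print("admissible clique orders:", floor(r0) + 1, "to", ceil(r1) - 1)
```

Even for a million nodes the admissible clique orders span only a couple of dozen values, which is why the later analysis writes $r$ as a constant multiple of $\log_b n$.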

The phase transition of the dominating clique problem in random graphs was studied independently by M. Nehéz and D. Olejár in [T21U31 and J. C. Culberson, Y. Gao, C. Anton in [5]. It was shown in [5] that the property of having a dominating clique is monotone, that it has a phase transition, and that the corresponding threshold probability is $p^* = (3 - \sqrt{5})/2$. The standard first and second moment methods (based on Markov's and Chebyshev's inequalities, respectively, see [211]) were used to prove this result. However, the preliminary result of M. Nehéz and D. Olejár [14] pointed out that describing the behavior of random graphs over the whole range of $p$ requires a more accurate analysis, namely in the case $(3 - \sqrt{5})/2 < p < 1/2$. The main result of this paper is a refinement of the previous results from [14]. Let us formulate it as the following theorem.

Theorem 1 Let $0 < p < 1$ be fixed and let $\mathbb{L}x$ denote $\log_{1/(1-p)} x$. Let $r$ be the order of a clique such that $\lfloor r_0 \rfloor < r < \lceil r_1 \rceil$. Let $\delta(n) : \mathbb{N} \to \mathbb{N}$ be an arbitrary slowly increasing function such that $\delta(n) = o(\log n)$ and let $G \in G(n,p)$ be a random graph. Then:

1. If $p > 1/2$, then an $r$-node clique is dominating in $G$ almost surely;

2. If $p < (3 - \sqrt{5})/2$, then an $r$-node clique is not dominating in $G$ almost surely;

3. If $(3 - \sqrt{5})/2 < p < 1/2$, then an $r$-node clique:

• is dominating in $G$ almost surely, if $r > \mathbb{L}n + \delta(n)$,

• is not dominating in $G$ almost surely, if $r < \mathbb{L}n - \delta(n)$,

• is dominating with a finite probability $f(p)$ for a suitable function $f : [0,1] \to [0,1]$, if $r = \mathbb{L}n + O(1)$.
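
The three regimes of Theorem 1 can be summarized in a few lines of Python. This is only a sketch: the names `regime` and `critical_order` are hypothetical, and placing the two boundary values of $p$ into the intermediate regime is an arbitrary choice of this example, since the theorem is stated with strict inequalities.

```python
from math import log, sqrt

P_LOW = (3 - sqrt(5)) / 2   # lower critical probability, about 0.382
P_HIGH = 0.5                # upper critical probability

def regime(p):
    """Classify an edge probability p according to Theorem 1."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    if p > P_HIGH:
        return "dominating"
    if p < P_LOW:
        return "not dominating"
    # Intermediate range: the answer depends on r relative to L n.
    return "depends on r"

def critical_order(n, p):
    """L n = log_{1/(1-p)} n, the clique order separating the two behaviours."""
    return log(n) / log(1 / (1 - p))

for p in (0.3, 0.45, 0.6):
    print(p, regime(p))
print("L n for n = 10^6, p = 0.45:", round(critical_order(10 ** 6, 0.45), 2))
```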

To prove Theorem 1, the first and the second moment methods were used. The leading part of our analysis follows from a property of a function defined as the ratio of two random variables which count dominating cliques and all cliques in random graphs, respectively. The critical values of $p$, namely $(3 - \sqrt{5})/2$ and $1/2$, are obtained from the bounds (1), (2), see [10], [12].

The rest of this paper contains the proof of Theorem 1. Section 2 contains the preliminary results; in particular, the expected number of dominating cliques in $G(n,p)$ is estimated there. The main result is proved in Section 3. Possible applications are discussed in Section 4.

2 Preliminary results 

For $r > 1$, let $S$ be an $r$-node subset of an $n$-node graph $G$. Let $A$ denote the event that "$S$ is a dominating clique of $G \in G(n,p)$". Let $in_r$ be the associated 0-1 (indicator) random variable on $G(n,p)$ defined as follows: $in_r = 1$ if $S$ is a dominating clique of $G$ and $in_r = 0$ otherwise. Let $X_r$ be the random variable that denotes the number of $r$-node dominating cliques. More precisely, $X_r = \sum in_r$, where the summation ranges over all sets $S$. The following lemma expresses the expectation of $X_r$.

Lemma 1 fiy]j The expectation $E(X_r)$ of the random variable $X_r$ is given by:

$$E(X_r) = \binom{n}{r} p^{\binom{r}{2}} \left(1 - p^r - (1-p)^r\right)^{n-r} . \tag{3}$$
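
For very small $n$ the expectation can be checked by exhaustive enumeration. The sketch below assumes the formula reads $E(X_r) = \binom{n}{r} p^{\binom{r}{2}} (1 - p^r - (1-p)^r)^{n-r}$ and compares it with a brute-force sum over all graphs on $n$ nodes; both function names are hypothetical.

```python
from itertools import combinations, product
from math import comb

def expected_dominating_cliques(n, r, p):
    """Closed form: C(n,r) * p^C(r,2) * (1 - p^r - (1-p)^r)^(n-r)."""
    return comb(n, r) * p ** comb(r, 2) * (1 - p ** r - (1 - p) ** r) ** (n - r)

def brute_force_expectation(n, r, p):
    """Sum Pr[G] * (number of r-node dominating cliques of G) over all graphs G."""
    nodes = range(n)
    pairs = list(combinations(nodes, 2))
    total = 0.0
    for mask in product((0, 1), repeat=len(pairs)):
        edges = {pair for pair, bit in zip(pairs, mask) if bit}
        weight = p ** len(edges) * (1 - p) ** (len(pairs) - len(edges))
        count = 0
        for s in combinations(nodes, r):
            # s must induce a complete subgraph.
            if any(pair not in edges for pair in combinations(s, 2)):
                continue
            # Every outside node needs a neighbour in s but not all of s.
            good = True
            for v in nodes:
                if v in s:
                    continue
                hits = sum((min(u, v), max(u, v)) in edges for u in s)
                if hits == 0 or hits == r:
                    good = False
                    break
            count += good
        total += weight * count
    return total

print(expected_dominating_cliques(5, 2, 0.3), brute_force_expectation(5, 2, 0.3))
```

The agreement is exact up to rounding because the edges inside $S$ and the edges from each outside node to $S$ are disjoint, hence independent.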

We use the following properties adopted from [15], pp. 501-502.

Claim 1. Let < p < 1 and k < (n — n < starting with some positive 

integer n. Then: 

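$$(1 - p^k)^n = \exp(-np^k) \left(1 + O(np^{2k})\right) = 1 - np^k + O(np^{2k}) .$$

Claim 2. Let $k = o(\sqrt{n})$, then:

$$n^{\underline{k}} = n(n-1) \cdots (n-k+1) = n^k \left(1 - \binom{k}{2} \frac{1}{n} + O\!\left(\frac{k^4}{n^2}\right)\right) .$$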
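
Both approximations are easy to probe numerically. The sketch below compares $(1 - p^k)^n$ with $\exp(-np^k)$, and the falling factorial $n(n-1)\cdots(n-k+1)$ with the standard first-order expansion $n^k (1 - \binom{k}{2}/n)$. The sample values of $n$, $p$, $k$ and the helper name `falling_factorial` are arbitrary choices for this example.

```python
from math import comb, exp, prod

def falling_factorial(n, k):
    """n (n-1) ... (n-k+1) as an exact integer."""
    return prod(range(n - k + 1, n + 1))

n, p, k = 10 ** 6, 0.5, 30

# Claim 1: (1 - p^k)^n is close to exp(-n p^k) when n p^(2k) is small.
lhs = (1 - p ** k) ** n
rhs = exp(-n * p ** k)
print(lhs, rhs)

# Claim 2: the falling factorial relative to n^k, against 1 - C(k,2)/n.
ratio = falling_factorial(n, k) / n ** k
first_order = 1 - comb(k, 2) / n
print(ratio, first_order)
```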

The upper bound on $r$ in $G(n,p)$ is stated in the following lemma.

Lemma 2 Let $b = 1/p$ and

$$r_u = 2 \log_b n - 2 \log_b \log_b n + 2 \log_b e + 1 - 2 \log_b 2 . \tag{4}$$

A random graph from $G(n,p)$ does not contain dominating cliques of order greater than $r_u$ with probability approaching 1 as $n \to \infty$.

Remark 1 Note that the upper bounds $r_u$ and $r_1$ are the same. The argument for the estimation of $r_1$ is the same as in Lemma 2.

To obtain conditions for the existence of dominating cliques in random graphs, it is sufficient to estimate the variance $Var(X_r)$. We can use the fact that the clique number in random graphs lies in a tight interval. We use the bounds (1) and (2) due to [10], [12]. The estimation of the variance $Var(X_r)$ is stated in the following lemma.

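Lemma 3 Let $p$ be fixed, $0 < p < 1$, and $\lfloor r_0 \rfloor < r < \lceil r_1 \rceil$. Let

$$\beta = \min\{ 2/3, \; -2 \log_b (1-p) \} .$$

Then:

$$Var(X_r) = E(X_r)^2 \cdot O\!\left(\frac{(\log n)^3}{n^{\beta}}\right) . \tag{5}$$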

The following lemma expresses the number of dominating cliques in random graphs.

Lemma 4 Let $p$, $r$ and $\beta$ be as before, and

$$X_r = \binom{n}{r} p^{\binom{r}{2}} \left(1 - p^r - (1-p)^r\right)^{n-r} \times \left\{ 1 + O\!\left(\frac{(\log n)^3}{n^{\beta/2}}\right) \right\} . \tag{6}$$

The probability that a random graph from $G(n,p)$ contains $X_r$ dominating cliques with $r$ nodes is $1 - O\left((\log n)^{-3}\right)$.



3 Proof of Theorem 1 

For $r > 1$, let $Y_r$ be the random variable on $G(n,p)$ which denotes the number of $r$-node cliques. According to [15],

$$Y_r = \binom{n}{r} p^{\binom{r}{2}} \left(1 - p^r\right)^{n-r} \times \{ 1 + o(1) \} . \tag{7}$$

The ratio $X_r / Y_r$ expresses the relative number of dominating cliques (with $r$ nodes) to all cliques (with $r$ nodes) in $G(n,p)$ and it attains a value in the interval $[0, 1]$. By analyzing the asymptotics of $X_r / Y_r$ as $n$ tends to $\infty$ we obtain our main result.

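Let us examine the limit value of the ratio $X_r / Y_r$:

$$\frac{X_r}{Y_r} = \left( \frac{1 - p^r - (1-p)^r}{1 - p^r} \right)^{n-r} \times \left\{ 1 + O\!\left(\frac{(\log n)^3}{n^{\beta/2}}\right) \right\} \times \{ 1 + o(1) \}^{-1} . \tag{8}$$
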
The most important term of the expression (8) is the first one, since the last two terms tend to 1 as $n \to \infty$. Let us define $\alpha : [0, 1] \to \mathbb{R}$ by:

$$\alpha(p) = -\log_{1/p} (1 - p) .$$

The plot of its graph is in Fig. 1 and, for simplification, we will also write $\alpha$ instead of $\alpha(p)$. Note that

$$(1-p)^r = p^{r\alpha} . \tag{9}$$
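
The function $\alpha$ and the identity $(1-p)^r = p^{r\alpha}$ can be checked numerically. The sketch below rewrites $\alpha(p) = -\log_{1/p}(1-p)$ with natural logarithms as $\ln(1-p)/\ln p$; the function name `alpha` is simply a transcription of the symbol, and the sample values are arbitrary.

```python
from math import isclose, log, sqrt

def alpha(p):
    """alpha(p) = -log_{1/p}(1 - p) = ln(1 - p) / ln(p) for 0 < p < 1."""
    return log(1 - p) / log(p)

# The identity (1 - p)^r = p^(r * alpha(p)) at a sample point.
p, r = 0.45, 17
print((1 - p) ** r, p ** (r * alpha(p)))

# alpha equals 1 at p = 1/2 and 1/2 at p = (3 - sqrt(5))/2.
print(alpha(0.5), alpha((3 - sqrt(5)) / 2))
```

The two special values explain the critical probabilities: with $r = \rho \log_b n$ and $1 < \rho < 2$, the product $\rho\alpha$ stays above 1 when $\alpha \ge 1$ and stays below 1 when $\alpha \le 1/2$.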




Figure 1: The graph of the function $\alpha(p) = -\log_{1/p}(1 - p)$.



According to Claim 1 and (9) we have:

$$\left( \frac{1 - p^r - (1-p)^r}{1 - p^r} \right)^{n-r} = \left( 1 - \frac{p^{r\alpha}}{1 - p^r} \right)^{n-r} = \exp\left( -\frac{n p^{r\alpha}}{1 - p^r} \right) \left[ 1 + O\left( n p^{2r\alpha} \right) \right] .$$


Let us analyze the asymptotic behavior of the ratio $X_r / Y_r$ as $n$ tends to $\infty$. According to the assumption $n \to \infty$, we can write $X_r / Y_r$ in the following two equivalent forms:

$$\frac{X_r}{Y_r} \approx \exp\left( -\frac{n p^{r\alpha}}{1 - p^r} \right) ,$$

or, applying (9), as:

$$\frac{X_r}{Y_r} \approx \exp\left( -\frac{n (1-p)^r}{1 - p^r} \right) .$$

Using bounds (1) and (2), the admissible number of nodes of a clique $r$ depends on $n$ as (we consider the leading term only):

$$r = \rho \log_b n , \tag{10}$$

where $1 < \rho < 2$. This results in:

$$\frac{X_r}{Y_r} \approx \exp\left( -n^{1 - \rho\alpha} \right) ,$$

and one has three different cases:

1. $1 - \rho\alpha < 0$, $\forall \rho \in [1, 2]$ $\iff$ $p > \frac{1}{2}$,

2. $1 - \rho\alpha > 0$, $\forall \rho \in [1, 2]$ $\iff$ $p < \frac{3 - \sqrt{5}}{2}$,

3. $1 - \rho\alpha$ changes sign as $\rho$ varies in $[1, 2]$ $\iff$ $\frac{3 - \sqrt{5}}{2} < p < \frac{1}{2}$.
The first case implies

$$\lim_{n \to \infty} \frac{X_r}{Y_r} = 1 ,$$

which means that an $r$-node clique is dominating in $G$ almost surely. The second case implies

$$\lim_{n \to \infty} \frac{X_r}{Y_r} = 0 ,$$

and therefore an $r$-node clique is not dominating in $G$ almost surely. In the third case, there exists a value $\tilde{\rho}$ of $\rho$ (for each $p$) in the interval $[1,2]$:

$$\tilde{\rho} = \frac{1}{\alpha(p)} ,$$

for which we have:

$$r = \tilde{\rho} \log_b n = \log_{1/(1-p)} n$$

and

$$\lim_{n \to \infty} \frac{X_r}{Y_r} = \exp\left( -n (1-p)^r \right) = e^{-1} .$$

The ratio $X_r / Y_r$ approaches 1 (0) for $\rho > \tilde{\rho}$ ($\rho < \tilde{\rho}$). Due to corrections of order less than $\Theta(\log n)$ to the equation (10) taken with $\rho = \tilde{\rho}$, the value $e^{-1}$ is changed to another constant greater than or equal to 0 and less than or equal to 1. The details are as follows. Let $\delta(n) : \mathbb{N} \to \mathbb{N}$ be an increasing function such that $\delta(n) = o(\log n)$.

If $r = \tilde{\rho} \log_b n + \delta(n)$, then $X_r / Y_r$ approaches 1 as $\exp\left( -(1-p)^{\delta(n)} \right)$.

If $r = \tilde{\rho} \log_b n - \delta(n)$, then $X_r / Y_r$ approaches 0 as $\exp\left( -(1-p)^{-\delta(n)} \right)$.

And finally, if $r$ differs from $\tilde{\rho} \log_b n$ by a constant $\lambda$, then the ratio $X_r / Y_r$ asymptotically looks like $\exp\left( -(1-p)^{\lambda} \right)$.
The proof is complete.
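
The effect of shifting $r$ away from the critical order can be illustrated numerically. The sketch below evaluates the leading factor $\left((1 - p^r - (1-p)^r)/(1 - p^r)\right)^{n-r}$ of the ratio at $r = \log_{1/(1-p)} n + \text{shift}$ and compares it with $\exp(-(1-p)^{\text{shift}})$. The function names, the sample values of $n$ and $p$, and the use of non-integer $r$ are assumptions of this example.

```python
from math import exp, log

def exact_factor(n, p, r):
    """((1 - p^r - (1-p)^r) / (1 - p^r))^(n - r), the leading factor of X_r / Y_r."""
    q = 1 - p
    return ((1 - p ** r - q ** r) / (1 - p ** r)) ** (n - r)

def shifted_limit(p, shift):
    """exp(-(1-p)^shift), the predicted value when r exceeds the critical order by shift."""
    return exp(-(1 - p) ** shift)

n, p = 10 ** 6, 0.45
r_crit = log(n) / log(1 / (1 - p))  # critical order log_{1/(1-p)} n
for shift in (-3, 0, 3):
    print(shift, exact_factor(n, p, r_crit + shift), shifted_limit(p, shift))
```

A shift of zero gives a value close to $e^{-1}$, while a few nodes more or fewer already push the factor visibly towards 1 or 0.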



Figure 2: The plot of the fraction $X_r / Y_r$ versus $n$ for three different choices of $\rho$ in the intermediate case when $(3 - \sqrt{5})/2 < p < 1/2$. In all three cases $p$ is set to 0.45 and $\rho$ varies (from top to bottom) as: $\rho = 1.9$, $\rho = 1/\alpha(0.45)$, and finally $\rho = 1.05$.
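
The qualitative shape of the three curves can be reproduced from the leading-order expression $\exp(-n^{1 - \rho\alpha})$ alone. The sketch below is an approximation under that assumption and is not the computation used for the original figure; the function names and the sample values of $n$ are chosen for illustration.

```python
from math import exp, log

def alpha(p):
    """alpha(p) = -log_{1/p}(1 - p), written with natural logarithms."""
    return log(1 - p) / log(p)

def leading_ratio(n, p, rho):
    """Leading-order value exp(-n^(1 - rho * alpha(p))) of X_r / Y_r for r = rho * log_b(n)."""
    return exp(-n ** (1 - rho * alpha(p)))

p = 0.45
for rho in (1.9, 1 / alpha(p), 1.05):
    values = [leading_ratio(n, p, rho) for n in (20, 40, 60, 80)]
    print(f"rho = {rho:.3f}:", [round(v, 3) for v in values])
```

For $\rho = 1/\alpha(0.45)$ the exponent of $n$ vanishes, so the leading-order value stays at $e^{-1}$ for every $n$; the other two choices drift towards 1 and 0 respectively.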



4 Discussion 

We have stated conditions for the existence of dominating cliques in the Erdős-Rényi
random graph model. Our result is a refinement of the previous ones
from M M- 

For possible applications of this result we refer to the two works of J. C. Culberson, Y. Gao, C. Anton [5] and M. Nehéz and D. Olejár [13]. The paper [5] deals with heuristics in satisfiability search. The second application, described in [13], is the construction of a space-efficient interval routing scheme with a small additive stretch in almost all networks modelled by random graphs $G(n,p)$ where $p > 1/2$. An application of this result can be found in decentralized content sharing systems based on the peer-to-peer (P2P) paradigm, such as Freenet, which uses the idea of interval routing for retrieving files from local datastores according to keys [4].

Acknowledgement. This work has been supported by Gratex Research, 
Bratislava, by CU grant No. 403/2007 and by the VEGA grant No. 1/3042/06. 

Appendix



Proof of Lemma 2. 

The proof follows from Markov's inequality [9], p. 8:

$$\Pr[X \ge t] \le \frac{E(X)}{t} , \quad t > 0 .$$



Let us denote $\alpha = \log_{1/p} \frac{1}{1-p} = -\log_b (1 - p)$. Note that:

$$(1-p)^r = p^{r\alpha} . \tag{11}$$

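Let $r = (2 - \varepsilon) \log_b n$, where $0 < \varepsilon < 1$. According to Claim 1 we have three cases: $p > 1/2$, $p = 1/2$ and $p < 1/2$. The first two of them can be analyzed together; performing elementary computations we obtain:

$$\left(1 - p^r - (1-p)^r\right)^{n-r} \approx 1 - n^{\varepsilon - 1} \xrightarrow{\; n \to \infty \;} 1 , \quad \text{if } p \ge \frac{1}{2} .$$

In the case $p < 1/2$ the same kind of algebra shows that

$$\left(1 - p^r - (1-p)^r\right)^{n-r} \approx \exp\left( -n^{\, 1 - (2 - \varepsilon) \frac{\ln(1-p)}{\ln(p)}} \right) , \quad \text{if } p < \frac{1}{2} .$$

We distinguish two different asymptotics in the previous formula. For given $p < 1/2$ they are separated by the condition

$$1 - (2 - \varepsilon) \frac{\ln(1-p)}{\ln(p)} = 0 .$$

This is solved with respect to $\varepsilon$ as:

$$\tilde{\varepsilon} = 2 - \frac{\ln(p)}{\ln(1-p)} .$$

Now we have:

• for $\varepsilon > \tilde{\varepsilon}$

$$\left(1 - p^r - (1-p)^r\right)^{n-r} \to 0 \quad \text{as } n \to \infty ,$$

• for $\varepsilon < \tilde{\varepsilon}$

$$\left(1 - p^r - (1-p)^r\right)^{n-r} \to 1 \quad \text{as } n \to \infty .$$

With respect to the upper and lower bounds on the size of a dominating clique, we require that $\tilde{\varepsilon}$ ranges between 0 and 1. This requirement then defines two critical values of the probability $p$:

• $\tilde{\varepsilon} = 1$ - in this case $p = \frac{1}{2}$,

• $\tilde{\varepsilon} = 0$ - in this case $p = \frac{3 - \sqrt{5}}{2}$.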
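
The two critical probabilities can be confirmed by evaluating the separating value of $\varepsilon$ directly. The sketch below assumes it reads $2 - \ln(p)/\ln(1-p)$; the function name `critical_epsilon` is hypothetical.

```python
from math import log, sqrt

def critical_epsilon(p):
    """Separating value 2 - ln(p) / ln(1 - p) of epsilon for 0 < p < 1."""
    return 2 - log(p) / log(1 - p)

p_low = (3 - sqrt(5)) / 2
for p in (p_low, 0.45, 0.5):
    print(f"p = {p:.4f}: critical epsilon = {critical_epsilon(p):.4f}")
```

At $p = (3 - \sqrt{5})/2$ one has $p = (1-p)^2$, so $\ln(p)/\ln(1-p) = 2$ and the separating value is 0; at $p = 1/2$ the two logarithms coincide and the value is 1.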



Stirling's formula (e.g. [IB], p. 127) yields:

$$\binom{n}{r} p^{\binom{r}{2}} \approx \frac{1}{\sqrt{2 \pi r}} \left( \frac{e n p^{(r-1)/2}}{r} \right)^{r} . \tag{12}$$



Consequently, 



"V*)-! and ( " y-V 1 )^!^ 



K r u + 1, 

The rest follows from Markov's inequality for $t = 1$.

Proof of Lemma 3. 

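In order to prove this lemma we will estimate the variance of $X_r$:

$$Var(X_r) = E(X_r^2) - E^2(X_r) . \tag{13}$$

The expectation of $X_r^2$ can be expressed in the following way:

$$E(X_r^2) = \binom{n}{r} \sum_{j=0}^{r} \binom{r}{j} \binom{n-r}{r-j} p^{2\binom{r}{2} - \binom{j}{2}} \times \left(1 - p^r - (1-p)^r\right)^{2(n-2r+j)} \cdot \Pr[S_1^r, S_2^r] . \tag{14}$$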

The equation (14) follows from the next analysis. The nodes of the first dominating clique $S_1^r$ can be chosen in $\binom{n}{r}$ ways. The dominating cliques $S_1^r$, $S_2^r$ can (but need not) have $j$ common nodes. These nodes can be chosen in $\binom{r}{j}$ ways. The remaining $(r - j)$ nodes of the second dominating clique $S_2^r$ have to be chosen from the $(n - r)$ nodes of $V(G) \setminus V(S_1^r)$. Now we shall choose edges: both dominating cliques are $r$-node complete graphs and therefore they contain $2\binom{r}{2}$ edges. But $S_1^r$, $S_2^r$ can have a nonempty intersection, a complete $j$-node subgraph. Therefore $\binom{j}{2}$ edges were counted twice. Both subgraphs $S_1^r$, $S_2^r$ are dominating cliques and so all $n - 2r + j$ nodes of the set $V(G) \setminus [V(S_1^r) \cup V(S_2^r)]$ are "good" with respect to both $S_1^r$, $S_2^r$. The last term, $\Pr[S_1^r, S_2^r]$, denotes the probability that the nodes of $V(S_1^r) \setminus V(S_2^r)$ are good with respect to $S_2^r$ and the nodes of $V(S_2^r) \setminus V(S_1^r)$ are good with respect to $S_1^r$. It is sufficient to estimate $\Pr[S_1^r, S_2^r]$ by 1.
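
The counting of pairs by overlap size can be sanity-checked: for a fixed first $r$-set, the numbers of second $r$-sets sharing exactly $j$ nodes with it must add up to the total number of $r$-sets, which is the Vandermonde identity. The sketch below verifies this for small values; the function name `overlap_counts` is hypothetical.

```python
from math import comb

def overlap_counts(n, r):
    """Number of r-subsets sharing exactly j nodes with a fixed r-subset, for j = 0..r."""
    # comb returns 0 when r - j exceeds n - r, so impossible overlaps contribute nothing.
    return [comb(r, j) * comb(n - r, r - j) for j in range(r + 1)]

n, r = 10, 4
counts = overlap_counts(n, r)
print(counts, sum(counts), comb(n, r))
```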

To prove that $Var(X_r)$ is asymptotically less than $E^2(X_r)$, we extract the expression $E^2(X_r)$ in front of the sum stated by the equation (14). We have:

$$E(X_r^2) \le E^2(X_r) \cdot \sum_{j=0}^{r} \binom{n}{r}^{-1} \binom{r}{j} \binom{n-r}{r-j} \cdot p^{-\binom{j}{2}} \cdot Q(p,r,j) , \tag{15}$$

where $Q(p,r,j) = \left(1 - p^r - (1-p)^r\right)^{-2r+2j}$.

First we estimate the expression $Q(p,r,j)$. Let us denote $\alpha = -\log_b (1 - p)$, as before. Recall that $(1-p)^r = p^{r\alpha}$. Let us also denote:

$$\nu = \min\{1, \; -\log_b (1 - p)\} . \tag{16}$$
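Therefore, from $\lfloor r_0 \rfloor < r < \lceil r_1 \rceil$ (cf. [E]), Claim 1 and (11), it follows:

$$\begin{aligned}
Q(p,r,j) &\le \left[ 1 - p^{r_0} - p^{\alpha r_0} \right]^{-2r} \\
&\le \left[ 1 - \frac{(\log_b n)^2}{2n \cdot \log_b e} - \left( \frac{(\log_b n)^2}{2n \cdot \log_b e} \right)^{\alpha} \right]^{-4 \log_b n} \\
&\le \exp\left\{ 4 \log_b n \left[ \frac{(\log_b n)^2}{2n \cdot \log_b e} + \left( \frac{(\log_b n)^2}{2n \cdot \log_b e} \right)^{\alpha} \right] \right\} \times \left[ 1 + O\!\left( \frac{(\log n)^{1+2\nu}}{n^{2\nu}} \right) \right] \\
&= \exp\left( \frac{2 (\log_b n)^3}{n \cdot \log_b e} \right) \exp\left( \frac{4 (\log_b n)^{2\alpha+1}}{(2n \cdot \log_b e)^{\alpha}} \right) \left[ 1 + O\!\left( \frac{(\log n)^{1+2\nu}}{n^{2\nu}} \right) \right] ,
\end{aligned}$$

where $\nu = \min\{1, \alpha\}$. Since

$$\frac{2 (\log_b n)^3}{n \cdot \log_b e} \to 0 \quad \text{and} \quad \frac{4 (\log_b n)^{2\alpha+1}}{(2n \cdot \log_b e)^{\alpha}} \to 0$$

as $n \to \infty$, the value of $Q(p,r,j)$ is $1 + o(1)$ or, more precisely

$$Q(p,r,j) = 1 + O\!\left( \frac{(\log n)^{2\nu+1}}{n^{2\nu}} \right) .$$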



Now we can concentrate our effort on the estimation of the sum

$$\sum_{j=0}^{r} \binom{n}{r}^{-1} \binom{r}{j} \binom{n-r}{r-j} \cdot p^{-\binom{j}{2}} , \tag{17}$$

where:

$$\lfloor r_0 \rfloor < r < \lceil r_1 \rceil . \tag{18}$$

We use a similar approach as D. Olejár and E. Toman in [15], pp. 504-506. This sum was also estimated in Subsection 5.3 of [16] (pp. 77-80), but we need a more accurate calculation here. First we introduce the following notation:

$$S(n,r,c,d) = \sum_{j=c}^{d} \binom{n}{r}^{-1} \binom{r}{j} \binom{n-r}{r-j} b^{\binom{j}{2}} .$$

Our solution is based on the idea of dividing the sum $S(n,r,0,r)$ into three parts in the following way:

$$S(n,r,0,r) \le S(n,r,0,1) + S(n,r,2,r_2) + S(n,r,r_2,r) , \tag{19}$$

where:

$$r_2 = (1 + \lambda) \log_b n \quad \text{for } 0 < \lambda < 1 .$$



All three parts will be estimated separately. Using Claim 2, the first part is estimated as follows:

$$S(n,r,0,1) = \binom{n}{r}^{-1} \left[ \binom{n-r}{r} + r \binom{n-r}{r-1} \right] = 1 + O\!\left( \frac{(\log n)^2}{n} \right) . \tag{20}$$



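To estimate the second part, it is sufficient to analyze the binomial coefficients. (See also [16], pp. 79-80.)

$$\binom{n}{r}^{-1} \binom{r}{j} \binom{n-r}{r-j} = \frac{r!}{n^{\underline{r}}} \cdot \frac{r^{\underline{j}}}{j!} \cdot \frac{(n-r)^{\underline{r-j}}}{(r-j)!} = \frac{\left(r^{\underline{j}}\right)^2 (n-r)^{\underline{r-j}}}{j! \cdot n^{\underline{j}} \, (n-j)^{\underline{r-j}}} \le \frac{r^{2j}}{j! \cdot n^{\underline{j}}} \approx \frac{r^{2j}}{j! \cdot n^{j}} .$$

We use Stirling's formula in the following form:

$$j! \ge \left( \frac{j}{e} \right)^{j} .$$

Consequently,

$$\binom{n}{r}^{-1} \binom{r}{j} \binom{n-r}{r-j} b^{\binom{j}{2}} \le \left( \frac{r^2 \cdot b^{j/2} \cdot e}{j \cdot n \cdot \sqrt{b}} \right)^{j} . \tag{21}$$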



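The members of the sum $S(n,r,2,r_2)$ attain their asymptotic maximum for $j = r_2$. More precisely, letting $j = r_2 = (1 + \lambda) \log_b n$ we have:

$$\frac{r^2 \cdot b^{j/2} \cdot e}{j \cdot n \cdot \sqrt{b}} = O\!\left( \frac{\log n}{n^{1/2 - \lambda/2}} \right) .$$

Thus,

$$S(n,r,2,r_2) \le \left( \frac{c_1 \cdot \log n}{n^{1/2 - \lambda/2}} \right)^{2} + \left( \frac{c_1 \cdot \log n}{n^{1/2 - \lambda/2}} \right)^{3} + \cdots + \left( \frac{c_1 \cdot \log n}{n^{1/2 - \lambda/2}} \right)^{r_2}$$

for a suitable constant $c_1$. It yields:

$$S(n,r,2,r_2) = O\!\left( \frac{(\log n)^2}{n^{1 - \lambda}} \right) . \tag{22}$$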



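To estimate the sum $S(n,r,r_2,r)$ we extract the term $\binom{n}{r}^{-1} \cdot b^{\binom{r}{2}}$:

$$S(n,r,r_2,r) = \binom{n}{r}^{-1} \cdot b^{\binom{r}{2}} \cdot \sum_{j=r_2}^{r} \binom{r}{j} \binom{n-r}{r-j} \cdot p^{\binom{r}{2} - \binom{j}{2}} .$$

To obtain an upper bound on the right-hand side sum, we substitute $\lceil r_1 \rceil$ for $r$ in its upper limit and $\lceil r_1 \rceil + 1$ for $r$ in all the summands. The reason for such a substitution is the assertion of Lemma 2 and Remark 1. Let us put $k = \lceil r_1 \rceil + 1 - j$. Consequently,

$$S(n,r,r_2,r) \le \binom{n}{r}^{-1} \cdot b^{\binom{r}{2}} \cdot \sum_{k=1}^{\lceil r_1 \rceil + 1 - r_2} \binom{\lceil r_1 \rceil + 1}{k} \binom{n}{k} \cdot p^{k [\lceil r_1 \rceil - (k-1)/2]} . \tag{23}$$

Note that

$$\binom{\lceil r_1 \rceil + 1}{k} \binom{n}{k} \cdot p^{k [\lceil r_1 \rceil - (k-1)/2]} \le \left( (\lceil r_1 \rceil + 1) \cdot n p^{\lceil r_1 \rceil - (k-1)/2} \right)^{k}$$

and

$$\lceil r_1 \rceil - (k-1)/2 \ge \lceil r_1 \rceil / 2 + r_2 / 2 = (3/2 + \lambda/2) \log_b n - \log_b \log_b n + O(1) .$$

It yields:

$$(\lceil r_1 \rceil + 1) \cdot n p^{\lceil r_1 \rceil - (k-1)/2} = O\!\left( \frac{(\log n)^2}{n^{1/2 + \lambda/2}} \right) . \tag{24}$$

According to (23) and (24),

$$S(n,r,r_2,r) \le \binom{n}{r}^{-1} \cdot b^{\binom{r}{2}} \cdot O\!\left( \frac{(\log n)^2}{n^{1/2 + \lambda/2}} \right) .$$

The term $\binom{n}{r}^{-1} \cdot b^{\binom{r}{2}}$ can be estimated using Stirling's formula. The estimation is the same as in the proof of Lemma 2, see (12); it shows that this term remains bounded if $r = \lceil r_1 \rceil - c$, where $c > 1$. Hence,

$$S(n,r,r_2,r) = O\!\left( \frac{(\log n)^2}{n^{1/2 + \lambda/2}} \right) . \tag{25}$$

Let us summarize our results: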



• Eq. (20) shows that $S(n,r,0,1)$ is close to 1 uniformly with respect to $\lambda$.

• Eq. (22) shows that the "mid" term $S(n,r,2,r_2)$ of the sum-splitting (19) is close to zero, however, non-uniformly in $\lambda$. As $\lambda$ approaches 1 from the left (i.e. the node number approaches its upper bound), $S(n,r,2,r_2)$ decreases to zero slowly.

• Eq. (25) shows that $S(n,r,r_2,r)$ is close to zero uniformly in $\lambda$. (We choose $\lambda = 0$ as the uniform upper bound.)



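Thus, we have:

$$\begin{aligned}
E(X_r^2) &= E^2(X_r) \left[ 1 + O\!\left( \frac{(\log n)^2}{n^{2/3}} \right) \right] \left[ 1 + O\!\left( \frac{(\log n)^{2\nu+1}}{n^{2\nu}} \right) \right] \\
&= E^2(X_r) \left[ 1 + O\!\left( \frac{(\log n)^3}{n^{\beta}} \right) \right] ,
\end{aligned}$$

where $\nu = \min\{ 1, -\log_b (1-p) \}$ and

$$\beta = \min\{ 2/3, \; -2 \log_b (1-p) \} .$$

Substituting into (13) we obtain the estimation of $Var(X_r)$.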

Proof of Lemma 4. 

It follows from Chebyshev's inequality [9]: if $Var(X)$ exists, then:

$$\Pr[\, |X - E(X)| \ge t \,] \le \frac{Var(X)}{t^2} , \quad t > 0 .$$

Letting $t = E(X_r) \cdot (\log n)^3 \cdot n^{-\beta/2}$ and using Lemma 3 we obtain the assertion of Lemma 4.
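
The arithmetic behind this last step can be made explicit. The sketch below assumes the variance bound of Lemma 3 has the form $Var(X_r) \le C \cdot E(X_r)^2 (\log n)^3 / n^{\beta}$ with some constant $C$ (set to 1 here for illustration) and plugs in the stated choice of $t$; the function name `chebyshev_bound` and the sample numbers are hypothetical.

```python
from math import log

def chebyshev_bound(expectation, n, beta, const=1.0):
    """Var / t^2 for Var = const * E^2 * (log n)^3 / n^beta and t = E * (log n)^3 * n^(-beta/2)."""
    variance = const * expectation ** 2 * log(n) ** 3 / n ** beta
    t = expectation * log(n) ** 3 * n ** (-beta / 2)
    return variance / t ** 2

n = 10 ** 6
print(chebyshev_bound(5.0, n, 2 / 3), log(n) ** -3)
```

The expectation and the power of $n$ cancel, leaving a bound of order $(\log n)^{-3}$ on the probability that $X_r$ deviates from $E(X_r)$ by more than $t$, which matches the probability stated in Lemma 4.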